Increasing Feature Selection Accuracy for L1 Regularized Linear Models
Authors
Abstract
L1 (also referred to as the 1-norm or lasso) penalty based formulations have been shown to be effective in problem domains where noisy features are present. However, the L1 penalty does not have favorable asymptotic properties with respect to feature selection and has been shown to be inconsistent as a feature selection estimator, e.g., when noisy features are correlated with the relevant features. This can compromise estimation of the correct feature set in domains such as robotics, where both the number of examples and the number of features are large. The weighted lasso penalty of Zou (2006) has been proposed to rectify this problem of correctly estimating the feature set. This paper proposes a novel method, applicable to both regression and classification algorithms, for identifying problem-specific L1 feature weights by building on the results of Zou (2006) and Rocha et al. (2009). Our method increases the accuracy of L1-penalized algorithms by running randomized experiments on subsets of the training data as a fast pre-processing step. We present experimental and theoretical results supporting the efficacy of the proposed method on two L1-penalized classification algorithms.
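As a rough illustration of the idea (the paper's exact procedure is not reproduced here), the sketch below estimates per-feature L1 weights from randomized fits on subsamples of the training data and then solves a weighted lasso by column rescaling; the subsample fraction, number of rounds, and weight formula are all assumptions.

```python
# Illustrative reconstruction of the weighted-L1 idea, NOT the authors'
# exact method: features that are rarely selected across randomized
# subsample fits receive large weights, i.e. heavier L1 penalization.
import numpy as np
from sklearn.linear_model import Lasso

def subsample_feature_weights(X, y, n_rounds=50, frac=0.5, alpha=0.1, seed=0):
    """Estimate selection frequency per feature over random subsamples
    and turn low frequency into a large penalty weight (assumed formula)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    freq = np.zeros(p)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        freq += np.abs(coef) > 1e-10
    return 1.0 / (freq / n_rounds + 1e-3)

def weighted_lasso(X, y, weights, alpha=0.1):
    """Penalizing w_j * |b_j| is equivalent to an ordinary lasso on the
    rescaled design X_j / w_j; coefficients are recovered by dividing
    the fitted values by w_j again."""
    model = Lasso(alpha=alpha).fit(X / weights, y)
    return model.coef_ / weights
```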
Similar resources
L1-regularization path algorithm for generalized linear models
We introduce a path-following algorithm for L1-regularized generalized linear models. The L1-regularization procedure is especially useful because it, in effect, selects variables according to the amount of penalization on the L1-norm of the coefficients, in a manner that is less greedy than forward selection–backward deletion. The generalized linear model path algorithm efficiently computes so...
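For a quick look at what such a coefficient path conveys, scikit-learn's lasso_path traces the same kind of path for the Gaussian special case; this is not the paper's GLM path-following algorithm, and the synthetic data are purely illustrative.

```python
# Trace a lasso coefficient path (coordinate descent, Gaussian case) and
# print which features are active as the L1 penalty decreases.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import lasso_path

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)
alphas, coefs, _ = lasso_path(X, y, n_alphas=50)
for a, c in zip(alphas[::10], coefs.T[::10]):
    print(f"alpha={a:9.3f}  active features: {np.flatnonzero(c).tolist()}")
```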
Model selection for linear classifiers using Bayesian error estimation
Regularized linear models are important classification methods for high-dimensional problems, where regularized linear classifiers are often preferred due to their ability to avoid overfitting. The degrees of freedom of the model are determined by a regularization parameter, which is typically selected using counting-based approaches such as K-fold cross-validation. For large data, this can be v...
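For reference, the counting-based baseline mentioned in this snippet looks like the following in scikit-learn; the dataset, fold count, and parameter grid are illustrative assumptions, and the paper's Bayesian error-estimation alternative is not reproduced here.

```python
# Select the regularization strength of an L1-penalized linear classifier
# by 5-fold cross-validation (the counting-based baseline).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

X, y = make_classification(n_samples=300, n_features=50, n_informative=8,
                           random_state=0)
clf = LogisticRegressionCV(Cs=10, cv=5, penalty="l1",
                           solver="liblinear", max_iter=5000)
clf.fit(X, y)
print("selected C:", clf.C_[0])
```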
A Comparison of Optimization Methods for Large-scale L1-regularized Linear Classification
Large-scale linear classification is widely used in many areas. The L1-regularized form can be applied for feature selection, but its non-differentiability makes training more difficult. Various optimization methods have been proposed in recent years, but no serious comparison among them has been made. In this paper, we discuss several state-of-the-art methods and propose two new impleme...
A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification
Large-scale linear classification is widely used in many areas. The L1-regularized form can be applied for feature selection; however, its non-differentiability causes more difficulties in training. Although various optimization methods have been proposed in recent years, these have not yet been compared suitably. In this paper, we first broadly review existing methods. Then, we discuss state-o...
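In the spirit of these two comparisons, the toy benchmark below fits the same L1-regularized logistic regression with two of scikit-learn's solvers and reports wall-clock time and coefficient sparsity; the data dimensions and the value of C are arbitrary, and this does not reproduce the papers' experiments.

```python
# Compare two solvers for the same L1-regularized logistic regression.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, n_features=500,
                           n_informative=20, random_state=0)
for solver in ("liblinear", "saga"):
    clf = LogisticRegression(penalty="l1", C=0.1, solver=solver,
                             max_iter=2000)
    t0 = time.perf_counter()
    clf.fit(X, y)
    elapsed = time.perf_counter() - t0
    nnz = int(np.count_nonzero(clf.coef_))
    print(f"{solver:9s}  time={elapsed:.2f}s  nonzero coefficients={nnz}")
```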
Learning Networks of Stochastic Differential Equations
We consider linear models for stochastic dynamics. To any such model one can associate a network (namely, a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval T. We analyze the l1-regularized least squares algorithm and, in the setting in which the underl...
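A minimal sketch of the l1-regularized least-squares estimator this snippet describes, assuming an Euler discretization of a sparse linear SDE; the dynamics matrix A, the step size, the noise level, and the lasso penalty are all made-up illustrative values.

```python
# Simulate a sparse linear SDE dx = A x dt + sigma dW with an Euler
# scheme, then recover each row of A by lasso-regressing the increments
# on the state.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, T, dt, sigma = 10, 20000, 0.01, 0.5
A = -np.eye(p)                              # stable self-decay ...
A[rng.random((p, p)) < 0.1] += 0.3          # ... plus sparse couplings

x = np.zeros((T, p))
for t in range(T - 1):                      # Euler-Maruyama simulation
    x[t + 1] = (x[t] + (A @ x[t]) * dt
                + sigma * np.sqrt(dt) * rng.standard_normal(p))

dx = (x[1:] - x[:-1]) / dt                  # discretized derivatives
A_hat = np.vstack([Lasso(alpha=0.05, fit_intercept=False)
                   .fit(x[:-1], dx[:, i]).coef_ for i in range(p)])
print("true nonzeros:", np.count_nonzero(A),
      " estimated nonzeros:", np.count_nonzero(np.abs(A_hat) > 0.05))
```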
Publication date: 2010